|     etl streamset etl streamset   |    

DataX 安装配置

git clone https://github.com/alibaba/DataX.git
cd DataX
mvn -U clean package assembly:assembly -Dmaven.test.skip=true

使用mysqlreader–>streamwriter

cd target/datax/datax/bin
python datax.py -r mysqlreader -w streamwriter
python datax.py ./mysql_2_local.json

执行最后有个统计信息

2017-10-31 20:34:16.229 [job-0] INFO  JobContainer - PerfTrace not enable!
2017-10-31 20:34:16.230 [job-0] INFO  StandAloneJobContainerCommunicator - Total 408147 records, 9795528 bytes | Speed 956.59KB/s, 40814 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 4.531s |  All Task WaitReaderTime 0.284s | Percentage 100.00%
2017-10-31 20:34:16.231 [job-0] INFO  JobContainer - 
任务启动时刻                    : 2017-10-31 20:34:05
任务结束时刻                    : 2017-10-31 20:34:16
任务总计耗时                    :                 11s
任务平均流量                    :          956.59KB/s
记录写入速度                    :          40814rec/s
读出记录总数                    :              408147
读写失败总数                    :                   0

得到配置文件模板

最后的配置文件

cat mysql_2_local.json 
{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.02
      }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "canal",
            "password": "canal",
            "column": [
              "X",
              "a",
              "b"
            ],
            "splitPk": "X",
            "connection": [
              {
                "table": [
                  "xdual10"
                ],
                "database": [
                  "test"
                ],
                "jdbcUrl": [
                  "jdbc:mysql://172.28.3.26:3306/test"
                ]
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "print": true
          }
        }
      }
    ]
  }
}

mysql一致性快照导出方式

ali的datax不支持binlog位点方式导出,一直想要这个功能,最开始想用sqoop来实现, 奈何是mr,实现了意义也不大,

参考DBZ 支持mysql一致性快照mysql导出,后端可以支持到其他存储。